Overview
Dataset statistics
| Number of variables | 22 |
|---|---|
| Number of observations | 3881 |
| Missing cells | 25764 |
| Missing cells (%) | 30.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 3.6 MiB |
| Average record size in memory | 985.1 B |
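The missing-cell percentage can be reproduced from the counts in this report. The sketch below is only a consistency check: the variable names are illustrative, and the per-variable counts are copied from the Missing alerts in this report, plus the 5 missing values of the Male/Female variable, which does not raise an alert.

```python
# Figures copied from the Dataset statistics table.
n_variables = 22
n_observations = 3881
missing_cells = 25764

total_cells = n_variables * n_observations
missing_pct = round(100 * missing_cells / total_cells, 1)

# Per-variable missing counts as listed in the Missing alerts of this report.
missing_by_variable = {
    "sample_id": 2522,
    "age_at_diagnosis": 182,
    "stage_at_diagnosis": 182,
    "tumor_size": 3267,
    "mitotic_rate": 3300,
    "treatment_response": 2541,
    "race": 637,
    "metastatic_site": 3085,
    "tumor_purity": 3037,
    "sample_coverage": 3085,
    "os_months": 710,
    "os_status": 666,
    "mutated_genes": 2545,
}
# The Male/Female variable has 5 missing values, too few to raise an alert.
unflagged_missing = 5
reconstructed_missing = sum(missing_by_variable.values()) + unflagged_missing

print(total_cells, missing_pct, reconstructed_missing)
```

The reconstructed total matches the reported number of missing cells, so the remaining variables appear to be complete.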
Variable types
| Numeric | 6 |
|---|---|
| Text | 5 |
| Categorical | 11 |
Unnamed: 0 is highly overall correlated with sample_type and 2 other fields | High correlation |
metastatic_site is highly overall correlated with sample_type and 2 other fields | High correlation |
mitotic_rate is highly overall correlated with source and 1 other field | High correlation |
os_status is highly overall correlated with sample_type and 2 other fields | High correlation |
sample_coverage is highly overall correlated with source and 1 other field | High correlation |
sample_type is highly overall correlated with Unnamed: 0 and 4 other fields | High correlation |
source is highly overall correlated with Unnamed: 0 and 10 other fields | High correlation |
stage_at_diagnosis is highly overall correlated with source and 2 other fields | High correlation |
treatment is highly overall correlated with os_status and 3 other fields | High correlation |
treatment_response is highly overall correlated with Unnamed: 0 and 2 other fields | High correlation |
tumor_grade is highly overall correlated with metastatic_site and 5 other fields | High correlation |
tumor_purity is highly overall correlated with tumor_grade | High correlation |
tumor_size is highly overall correlated with source and 1 other field | High correlation |
treatment is highly imbalanced (62.1%) | Imbalance |
primary_site is highly imbalanced (56.1%) | Imbalance |
metastatic_site is highly imbalanced (54.9%) | Imbalance |
sample_id has 2522 (65.0%) missing values | Missing |
age_at_diagnosis has 182 (4.7%) missing values | Missing |
stage_at_diagnosis has 182 (4.7%) missing values | Missing |
tumor_size has 3267 (84.2%) missing values | Missing |
mitotic_rate has 3300 (85.0%) missing values | Missing |
treatment_response has 2541 (65.5%) missing values | Missing |
race has 637 (16.4%) missing values | Missing |
metastatic_site has 3085 (79.5%) missing values | Missing |
tumor_purity has 3037 (78.3%) missing values | Missing |
sample_coverage has 3085 (79.5%) missing values | Missing |
os_months has 710 (18.3%) missing values | Missing |
os_status has 666 (17.2%) missing values | Missing |
mutated_genes has 2545 (65.6%) missing values | Missing |
Unnamed: 0 is uniformly distributed | Uniform |
Unnamed: 0 has unique values | Unique |
mitotic_rate has 43 (1.1%) zeros | Zeros |
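The Imbalance alerts give a score between 0% and 100%. A common definition of such a score, and my understanding of how ydata-profiling computes it (an assumption, not something this report states), is one minus the Shannon entropy of the category frequencies divided by the maximum possible entropy. The function name and the 0.5 alert threshold below are likewise assumptions. The example uses the four sample_type counts from this report, since they are fully listed.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / H_max, where H is the Shannon entropy of the counts.

    0.0 means perfectly balanced classes; values near 1.0 mean one class dominates.
    """
    n_classes = len(counts)
    if n_classes <= 1:
        return 0.0
    total = sum(counts)
    probabilities = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probabilities)
    return 1.0 - entropy / math.log2(n_classes)

# sample_type counts from this report: Unknown, Metastasis, Primary, Local Recurrence.
sample_type_counts = [2532, 659, 546, 144]
sample_type_score = imbalance_score(sample_type_counts)

ASSUMED_ALERT_THRESHOLD = 0.5  # assumption, not taken from the report
print(f"{sample_type_score:.2f}", sample_type_score > ASSUMED_ALERT_THRESHOLD)
```

Under these assumptions sample_type scores roughly 0.29, below the assumed threshold, which is consistent with it having no Imbalance alert.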
Reproduction
| Analysis started | 2025-07-30 02:13:12.967521 |
|---|---|
| Analysis finished | 2025-07-30 02:13:20.454857 |
| Duration | 7.49 seconds |
| Software version | ydata-profiling v4.16.1 |
Variables
Unnamed: 0
Real number (ℝ)
High correlation  Uniform  Unique 
| Distinct | 3881 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1940 |
| Minimum | 0 |
| Maximum | 3880 |
| Zeros | 1 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 194 |
| Q1 | 970 |
| median | 1940 |
| Q3 | 2910 |
| 95-th percentile | 3686 |
| Maximum | 3880 |
| Range | 3880 |
| Interquartile range (IQR) | 1940 |
Descriptive statistics
| Standard deviation | 1120.4925 |
|---|---|
| Coefficient of variation (CV) | 0.57757347 |
| Kurtosis | -1.2 |
| Mean | 1940 |
| Median Absolute Deviation (MAD) | 970 |
| Skewness | 0 |
| Sum | 7529140 |
| Variance | 1255503.5 |
| Monotonicity | Strictly increasing |
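These statistics are exactly what a 0-based row counter over 3881 rows produces, which suggests that Unnamed: 0 is a leftover index column rather than a measured variable. That reading is an inference, not something the report states. A short check with Python's standard library:

```python
import statistics

# A 0-based row counter over 3881 rows: 0, 1, ..., 3880.
index_values = list(range(3881))

total = sum(index_values)
mean = statistics.mean(index_values)
variance = statistics.variance(index_values)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(index_values)      # sample standard deviation
cv = std_dev / mean                           # coefficient of variation

print(total, mean, variance, round(std_dev, 4), round(cv, 8))
```

If the column really is an exported index, dropping it before modelling would also remove the correlations it shows with sample_type and source, which may simply reflect the row order of the concatenated sources.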
| Value | Count | Frequency (%) |
| 3880 | 1 | < 0.1% |
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 3864 | 1 | < 0.1% |
| Other values (3871) | 3871 | 99.7% |
| Value | Count | Frequency (%) |
| 0 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 1 | < 0.1% |
| 4 | 1 | < 0.1% |
| 5 | 1 | < 0.1% |
| 6 | 1 | < 0.1% |
| 7 | 1 | < 0.1% |
| 8 | 1 | < 0.1% |
| 9 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 3880 | 1 | < 0.1% |
| 3879 | 1 | < 0.1% |
| 3878 | 1 | < 0.1% |
| 3877 | 1 | < 0.1% |
| 3876 | 1 | < 0.1% |
| 3875 | 1 | < 0.1% |
| 3874 | 1 | < 0.1% |
| 3873 | 1 | < 0.1% |
| 3872 | 1 | < 0.1% |
| 3871 | 1 | < 0.1% |
sample_id
Text
Missing 
| Distinct | 1160 |
|---|---|
| Distinct (%) | 85.4% |
| Missing | 2522 |
| Missing (%) | 65.0% |
| Memory size | 173.8 KiB |
Length
| Max length | 17 |
|---|---|
| Median length | 17 |
| Mean length | 14.465048 |
| Min length | 10 |
Unique
| Unique | 1093 |
|---|---|
| Unique (%) | 80.4% |
Sample
| 1st row | COSS1030183 |
|---|---|
| 2nd row | COSS1030184 |
| 3rd row | COSS1035469 |
| 4th row | COSS1035470 |
| 5th row | COSS1036012 |
| Value | Count | Frequency (%) |
| p-0001315-t02-im5 | 9 | 0.7% |
| p-0001315-t01-im3 | 9 | 0.7% |
| p-0002276-t01-im3 | 8 | 0.6% |
| p-0012178-t01-im5 | 7 | 0.5% |
| p-0002477-t01-im3 | 7 | 0.5% |
| p-0007513-t03-im5 | 6 | 0.4% |
| p-0004937-t01-im5 | 6 | 0.4% |
| p-0000501-t02-im3 | 6 | 0.4% |
| p-0005066-t01-im5 | 6 | 0.4% |
| p-0004937-t03-im6 | 6 | 0.4% |
| Other values (1150) | 1289 | 94.8% |
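The sample rows and the value table show at least two identifier formats (for example COSS1030183 and p-0001315-t02-im5), which probably come from different sources. A minimal sketch for tagging each sample_id by format follows. The two regular expressions and the labels "coss" and "panel" are hypothetical, inferred only from the examples shown here. Matching is case-insensitive because the character counts suggest the raw IDs use uppercase letters while the value table shows them lowercased.

```python
import re
from collections import Counter

# Hypothetical patterns inferred from the example IDs shown in this report.
ID_PATTERNS = {
    "coss": re.compile(r"^COSS\d+$", re.IGNORECASE),
    "panel": re.compile(r"^P-\d{7}-T\d{2}-IM\d$", re.IGNORECASE),
}

def classify_sample_id(sample_id):
    """Return the name of the first matching format, 'missing', or 'other'."""
    if sample_id is None:
        return "missing"
    for name, pattern in ID_PATTERNS.items():
        if pattern.match(sample_id):
            return name
    return "other"

examples = ["COSS1030183", "p-0001315-t02-im5", "P-0002276-T01-IM3", None, "XYZ"]
format_counts = Counter(classify_sample_id(s) for s in examples)
print(dict(format_counts))
```

Run over the full column, a breakdown like this would show how much of the 65.0% missingness is tied to sources that do not supply sample identifiers at all.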
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 3258 | 16.6% |
| - | 2388 | 12.1% |
| 1 | 1871 | 9.5% |
| S | 1126 | 5.7% |
| 2 | 1054 | 5.4% |
| 6 | 1053 | 5.4% |
| 3 | 917 | 4.7% |
| 5 | 914 | 4.6% |
| 4 | 836 | 4.3% |
| P | 796 | 4.0% |
| Other values (8) | 5445 | 27.7% |
patient_id
Text
| Distinct | 3301 |
|---|---|
| Distinct (%) | 85.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 247.6 KiB |
Length
| Max length | 36 |
|---|---|
| Median length | 9 |
| Mean length | 8.2826591 |
| Min length | 3 |
Unique
| Unique | 3018 |
|---|---|
| Unique (%) | 77.8% |
Sample
| 1st row | 924209 |
|---|---|
| 2nd row | 924209 |
| 3rd row | 929361 |
| 4th row | 929361 |
| 5th row | 929884 |
| Value | Count | Frequency (%) |
| p-0001315 | 18 | 0.5% |
| p-0004937 | 12 | 0.3% |
| 1464316 | 9 | 0.2% |
| p-0000134 | 8 | 0.2% |
| p-0004760 | 8 | 0.2% |
| p-0008084 | 8 | 0.2% |
| p-0002276 | 8 | 0.2% |
| p-0002594 | 8 | 0.2% |
| p-0006104 | 8 | 0.2% |
| p-0001157 | 8 | 0.2% |
| Other values (3291) | 3786 | 97.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 4363 | 13.6% |
| 1 | 4248 | 13.2% |
| 2 | 3564 | 11.1% |
| 6 | 2881 | 9.0% |
| 3 | 2647 | 8.2% |
| 4 | 2621 | 8.2% |
| 9 | 2593 | 8.1% |
| 5 | 2450 | 7.6% |
| 8 | 2018 | 6.3% |
| 7 | 1997 | 6.2% |
| Other values (8) | 2763 | 8.6% |
age_at_diagnosis
Real number (ℝ)
Missing 
| Distinct | 78 |
|---|---|
| Distinct (%) | 2.1% |
| Missing | 182 |
| Missing (%) | 4.7% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 61.985401 |
| Minimum | 7 |
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.4 KiB |
Quantile statistics
| Minimum | 7 |
|---|---|
| 5-th percentile | 37 |
| Q1 | 53 |
| median | 63 |
| Q3 | 72 |
| 95-th percentile | 84 |
| Maximum | 90 |
| Range | 83 |
| Interquartile range (IQR) | 19 |
Descriptive statistics
| Standard deviation | 13.981321 |
|---|---|
| Coefficient of variation (CV) | 0.22555829 |
| Kurtosis | 0.031991203 |
| Mean | 61.985401 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | -0.43409614 |
| Sum | 229284 |
| Variance | 195.47734 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 66 | 123 | 3.2% |
| 67 | 120 | 3.1% |
| 60 | 115 | 3.0% |
| 65 | 112 | 2.9% |
| 71 | 110 | 2.8% |
| 64 | 107 | 2.8% |
| 72 | 106 | 2.7% |
| 70 | 106 | 2.7% |
| 59 | 105 | 2.7% |
| 69 | 101 | 2.6% |
| Other values (68) | 2594 | 66.8% |
| (Missing) | 182 | 4.7% |
| Value | Count | Frequency (%) |
| 7 | 1 | < 0.1% |
| 11 | 1 | < 0.1% |
| 12 | 2 | 0.1% |
| 14 | 1 | < 0.1% |
| 17 | 1 | < 0.1% |
| 18 | 2 | 0.1% |
| 19 | 5 | 0.1% |
| 20 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
| 22 | 5 | 0.1% |
| Value | Count | Frequency (%) |
| 90 | 43 | 1.1% |
| 89 | 11 | 0.3% |
| 88 | 20 | 0.5% |
| 87 | 22 | 0.6% |
| 86 | 25 | 0.6% |
| 85 | 34 | 0.9% |
| 84 | 39 | 1.0% |
| 83 | 38 | 1.0% |
| 82 | 29 | 0.7% |
| 81 | 40 | 1.0% |
stage_at_diagnosis
Categorical
High correlation  Missing 
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 182 |
| Missing (%) | 4.7% |
| Memory size | 247.5 KiB |
Length
| Max length | 10 |
|---|---|
| Median length | 9 |
| Mean length | 8.3254934 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unknown |
|---|---|
| 2nd row | Unknown |
| 3rd row | Unknown |
| 4th row | Unknown |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Localized | 1381 | 35.6% |
| Unknown | 1355 | 34.9% |
| Metastatic | 517 | 13.3% |
| Regional | 374 | 9.6% |
| metastasis | 72 | 1.9% |
| (Missing) | 182 | 4.7% |
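The table lists both Metastatic (517) and metastasis (72), which look like two spellings of the same stage coming from different sources. If that reading is right (it is an assumption, not something the report states), the labels could be merged before analysis. A minimal sketch using only the standard library, where STAGE_ALIASES and normalize_stage are hypothetical names:

```python
from collections import Counter

# Counts copied from the Common Values table (missing rows excluded).
stage_counts = {
    "Localized": 1381,
    "Unknown": 1355,
    "Metastatic": 517,
    "Regional": 374,
    "metastasis": 72,
}

# Assumed alias: treat "metastasis" as a variant spelling of "Metastatic".
STAGE_ALIASES = {"metastasis": "Metastatic"}

def normalize_stage(label):
    """Trim whitespace and map known variant spellings to one canonical label."""
    cleaned = label.strip()
    return STAGE_ALIASES.get(cleaned.lower(), cleaned)

merged = Counter()
for label, count in stage_counts.items():
    merged[normalize_stage(label)] += count

print(dict(merged))
```

After merging, four categories remain and Metastatic holds 589 records. Whether Unknown should also be folded into the missing values is a separate decision that this sketch does not make.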
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| localized | 1381 | 37.3% |
| unknown | 1355 | 36.6% |
| metastatic | 517 | 14.0% |
| regional | 374 | 10.1% |
| metastasis | 72 | 1.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 4439 | 14.4% |
| o | 3110 | 10.1% |
| a | 2933 | 9.5% |
| e | 2344 | 7.6% |
| i | 2344 | 7.6% |
| c | 1898 | 6.2% |
| l | 1755 | 5.7% |
| t | 1695 | 5.5% |
| L | 1381 | 4.5% |
| d | 1381 | 4.5% |
| Other values (9) | 7516 | 24.4% |
tumor_size
Real number (ℝ)
High correlation  Missing 
| Distinct | 145 |
|---|---|
| Distinct (%) | 23.6% |
| Missing | 3267 |
| Missing (%) | 84.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.5754072 |
| Minimum | 0.7 |
| Maximum | 42 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.4 KiB |
Quantile statistics
| Minimum | 0.7 |
|---|---|
| 5-th percentile | 2.4 |
| Q1 | 5.125 |
| median | 8.5 |
| Q3 | 13 |
| 95-th percentile | 20 |
| Maximum | 42 |
| Range | 41.3 |
| Interquartile range (IQR) | 7.875 |
Descriptive statistics
| Standard deviation | 5.8640107 |
|---|---|
| Coefficient of variation (CV) | 0.61240327 |
| Kurtosis | 2.4200107 |
| Mean | 9.5754072 |
| Median Absolute Deviation (MAD) | 3.5 |
| Skewness | 1.2060716 |
| Sum | 5879.3 |
| Variance | 34.386621 |
| Monotonicity | Not monotonic |
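The quantile figures can be turned into a simple outlier screen. The sketch below recomputes the IQR from Q1 and Q3 and applies Tukey's 1.5 × IQR fences. The fences are a common convention that I am introducing for illustration, not something ydata-profiling reports. The counts of the largest values are copied from the extreme-values table for this variable.

```python
# Quantiles copied from the Quantile statistics table for tumor_size.
q1, q3 = 5.125, 13.0
iqr = q3 - q1

# Tukey fences: a conventional rule of thumb, used here as an assumption.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# The ten largest distinct values and their counts, as listed in this report.
largest_values = {
    42: 1, 36.5: 1, 34: 1, 29.5: 1, 27.7: 1,
    26: 1, 25: 8, 24: 9, 21: 5, 20.4: 1,
}
n_above_fence = sum(
    count for value, count in largest_values.items() if value > upper_fence
)

n_observed = 3881 - 3267  # observations minus missing values
print(iqr, lower_fence, upper_fence, n_above_fence, n_observed)
```

Because the smallest listed value (20.4) is below the upper fence, the ten largest values cover every record above it: 14 of the 614 observed tumor sizes. The lower fence is negative, so no small values are flagged.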
| Value | Count | Frequency (%) |
| 11 | 26 | 0.7% |
| 15 | 26 | 0.7% |
| 8 | 25 | 0.6% |
| 10 | 20 | 0.5% |
| 12 | 19 | 0.5% |
| 6.5 | 18 | 0.5% |
| 7 | 16 | 0.4% |
| 14 | 15 | 0.4% |
| 6 | 12 | 0.3% |
| 5 | 11 | 0.3% |
| Other values (135) | 426 | 11.0% |
| (Missing) | 3267 | 84.2% |
| Value | Count | Frequency (%) |
| 0.7 | 1 | < 0.1% |
| 1 | 1 | < 0.1% |
| 1.2 | 1 | < 0.1% |
| 1.4 | 1 | < 0.1% |
| 1.5 | 2 | 0.1% |
| 1.7 | 3 | 0.1% |
| 1.8 | 3 | 0.1% |
| 1.9 | 2 | 0.1% |
| 2 | 6 | 0.2% |
| 2.1 | 2 | 0.1% |
| Value | Count | Frequency (%) |
| 42 | 1 | < 0.1% |
| 36.5 | 1 | < 0.1% |
| 34 | 1 | < 0.1% |
| 29.5 | 1 | < 0.1% |
| 27.7 | 1 | < 0.1% |
| 26 | 1 | < 0.1% |
| 25 | 8 | 0.2% |
| 24 | 9 | 0.2% |
| 21 | 5 | 0.1% |
| 20.4 | 1 | < 0.1% |
mitotic_rate
Real number (ℝ)
High correlation  Missing  Zeros 
| Distinct | 63 |
|---|---|
| Distinct (%) | 10.8% |
| Missing | 3300 |
| Missing (%) | 85.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 17.58864 |
| Minimum | 0 |
| Maximum | 175 |
| Zeros | 43 |
| Zeros (%) | 1.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.4 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 2 |
| median | 7 |
| Q3 | 25 |
| 95-th percentile | 55 |
| Maximum | 175 |
| Range | 175 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 23.43541 |
|---|---|
| Coefficient of variation (CV) | 1.3324174 |
| Kurtosis | 8.0323429 |
| Mean | 17.58864 |
| Median Absolute Deviation (MAD) | 6 |
| Skewness | 2.4506504 |
| Sum | 10219 |
| Variance | 549.21842 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 2 | 60 | 1.5% |
| 1 | 55 | 1.4% |
| 0 | 43 | 1.1% |
| 5 | 38 | 1.0% |
| 10 | 29 | 0.7% |
| 4 | 27 | 0.7% |
| 50 | 27 | 0.7% |
| 3 | 23 | 0.6% |
| 20 | 23 | 0.6% |
| 7 | 23 | 0.6% |
| Other values (53) | 233 | 6.0% |
| (Missing) | 3300 | 85.0% |
| Value | Count | Frequency (%) |
| 0 | 43 | 1.1% |
| 1 | 55 | 1.4% |
| 2 | 60 | 1.5% |
| 3 | 23 | 0.6% |
| 4 | 27 | 0.7% |
| 5 | 38 | 1.0% |
| 6 | 22 | 0.6% |
| 7 | 23 | 0.6% |
| 8 | 15 | 0.4% |
| 9 | 5 | 0.1% |
| Value | Count | Frequency (%) |
| 175 | 1 | < 0.1% |
| 145 | 1 | < 0.1% |
| 130 | 1 | < 0.1% |
| 125 | 1 | < 0.1% |
| 112 | 1 | < 0.1% |
| 104 | 1 | < 0.1% |
| 102 | 1 | < 0.1% |
| 100 | 4 | 0.1% |
| 95 | 1 | < 0.1% |
| 90 | 7 | 0.2% |
treatment
Categorical
High correlation  Imbalance 
| Distinct | 28 |
|---|---|
| Distinct (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 243.8 KiB |
Length
| Max length | 31 |
|---|---|
| Median length | 7 |
| Mean length | 7.2821438 |
| Min length | 4 |
Unique
| Unique | 9 |
|---|---|
| Unique (%) | 0.2% |
Sample
| 1st row | IMATINIB |
|---|---|
| 2nd row | IMATINIB |
| 3rd row | IMATINIB |
| 4th row | IMATINIB |
| 5th row | IMATINIB |
Common Values
| Value | Count | Frequency (%) |
| SURGERY | 2429 | 62.6% |
| IMATINIB | 621 | 16.0% |
| OTHER | 483 | 12.4% |
| UNKNOWN | 90 | 2.3% |
| SUNITINIB | 65 | 1.7% |
| CLINICAL_TRIAL | 50 | 1.3% |
| IMATINIB + SUNITINIB | 39 | 1.0% |
| REGORAFENIB | 32 | 0.8% |
| SORAFENIB | 15 | 0.4% |
| PAZOPANIB | 13 | 0.3% |
| Other values (18) | 44 | 1.1% |
Length
| Value | Count | Frequency (%) |
| surgery | 2429 | 61.0% |
| imatinib | 669 | 16.8% |
| other | 483 | 12.1% |
| sunitinib | 105 | 2.6% |
| unknown | 97 | 2.4% |
| + | 51 | 1.3% |
| clinical_trial | 50 | 1.3% |
| regorafenib | 32 | 0.8% |
| sorafenib | 17 | 0.4% |
| pazopanib | 14 | 0.4% |
| Other values (14) | 36 | 0.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| R | 5500 | 19.5% |
| E | 3023 | 10.7% |
| U | 2642 | 9.3% |
| I | 2603 | 9.2% |
| S | 2558 | 9.1% |
| G | 2461 | 8.7% |
| Y | 2435 | 8.6% |
| T | 1347 | 4.8% |
| N | 1334 | 4.7% |
| A | 871 | 3.1% |
| Other values (17) | 3488 | 12.3% |
treatment_response
Categorical
High correlation  Missing 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 2541 |
| Missing (%) | 65.5% |
| Memory size | 238.7 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 2 |
| Mean length | 3.9589552 |
| Min length | 2 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | PR |
|---|---|
| 2nd row | PR |
| 3rd row | NR |
| 4th row | PR |
| 5th row | CR |
Common Values
| Value | Count | Frequency (%) |
| UNKNOWN | 525 | 13.5% |
| NR | 494 | 12.7% |
| PR | 133 | 3.4% |
| CR | 83 | 2.1% |
| SD | 63 | 1.6% |
| NE | 42 | 1.1% |
| (Missing) | 2541 | 65.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| unknown | 525 | 39.2% |
| nr | 494 | 36.9% |
| pr | 133 | 9.9% |
| cr | 83 | 6.2% |
| sd | 63 | 4.7% |
| ne | 42 | 3.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| N | 2111 | 39.8% |
| R | 710 | 13.4% |
| U | 525 | 9.9% |
| K | 525 | 9.9% |
| O | 525 | 9.9% |
| W | 525 | 9.9% |
| P | 133 | 2.5% |
| C | 83 | 1.6% |
| S | 63 | 1.2% |
| D | 63 | 1.2% |
primary_site
Categorical
Imbalance 
| Distinct | 24 |
|---|---|
| Distinct (%) | 0.6% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 259.0 KiB |
Length
| Max length | 37 |
|---|---|
| Median length | 7 |
| Mean length | 11.311002 |
| Min length | 4 |
Unique
| Unique | 4 |
|---|---|
| Unique (%) | 0.1% |
Sample
| 1st row | Stomach |
|---|---|
| 2nd row | Stomach |
| 3rd row | Small Intestine |
| 4th row | Small Intestine |
| 5th row | Small Intestine |
Common Values
| Value | Count | Frequency (%) |
| Stomach | 2153 | 55.5% |
| Small Intestine | 1071 | 27.6% |
| Soft Tissue | 101 | 2.6% |
| Colon And Rectum (Excluding Appendix) | 98 | 2.5% |
| Abdomen/Intraabdominal | 88 | 2.3% |
| Digestive Other | 76 | 2.0% |
| Colon/Rectum | 63 | 1.6% |
| GI Tract (Indeterminate) | 61 | 1.6% |
| Retroperitoneum | 55 | 1.4% |
| Retroperitoneum And Peritoneum | 29 | 0.7% |
| Other values (14) | 86 | 2.2% |
Length
| Value | Count | Frequency (%) |
| stomach | 2153 | 37.6% |
| small | 1071 | 18.7% |
| intestine | 1071 | 18.7% |
| and | 129 | 2.3% |
| appendix | 103 | 1.8% |
| tissue | 101 | 1.8% |
| soft | 101 | 1.8% |
| colon | 98 | 1.7% |
| rectum | 98 | 1.7% |
| excluding | 98 | 1.7% |
| Other values (29) | 701 | 12.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 5214 | 11.9% |
| m | 3749 | 8.5% |
| a | 3713 | 8.5% |
| e | 3393 | 7.7% |
| S | 3325 | 7.6% |
| n | 3193 | 7.3% |
| o | 3024 | 6.9% |
| l | 2542 | 5.8% |
| c | 2499 | 5.7% |
| h | 2258 | 5.1% |
| Other values (35) | 10988 | 25.0% |
sample_type
Categorical
High correlation 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 245.9 KiB |
Length
| Max length | 16 |
|---|---|
| Median length | 7 |
| Mean length | 7.8433393 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Primary |
|---|---|
| 2nd row | Primary |
| 3rd row | Metastasis |
| 4th row | Metastasis |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Unknown | 2532 | 65.2% |
| Metastasis | 659 | 17.0% |
| Primary | 546 | 14.1% |
| Local Recurrence | 144 | 3.7% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| unknown | 2532 | 62.9% |
| metastasis | 659 | 16.4% |
| primary | 546 | 13.6% |
| local | 144 | 3.6% |
| recurrence | 144 | 3.6% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 7740 | 25.4% |
| o | 2676 | 8.8% |
| U | 2532 | 8.3% |
| k | 2532 | 8.3% |
| w | 2532 | 8.3% |
| a | 2008 | 6.6% |
| s | 1977 | 6.5% |
| r | 1380 | 4.5% |
| t | 1318 | 4.3% |
| i | 1205 | 4.0% |
| Other values (11) | 4540 | 14.9% |
race
Categorical
Missing 
| Distinct | 9 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 637 |
| Missing (%) | 16.4% |
| Memory size | 261.0 KiB |
Length
| Max length | 57 |
|---|---|
| Median length | 5 |
| Mean length | 12.78021 |
| Min length | 5 |
Unique
| Unique | 1 |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | White |
|---|---|
| 2nd row | White |
| 3rd row | White |
| 4th row | White |
| 5th row | White |
Common Values
| Value | Count | Frequency (%) |
| White | 2067 | 53.3% |
| Black | 465 | 12.0% |
| Other (American Indian/AK Native, Asian/Pacific Islander) | 443 | 11.4% |
| Black or African American | 99 | 2.6% |
| Unknown | 96 | 2.5% |
| Asian | 55 | 1.4% |
| Other | 15 | 0.4% |
| Not Provided | 3 | 0.1% |
| Native American | 1 | < 0.1% |
| (Missing) | 637 | 16.4% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| white | 2067 | 35.9% |
| black | 564 | 9.8% |
| american | 543 | 9.4% |
| other | 458 | 8.0% |
| native | 444 | 7.7% |
| indian/ak | 443 | 7.7% |
| asian/pacific | 443 | 7.7% |
| islander | 443 | 7.7% |
| or | 99 | 1.7% |
| african | 99 | 1.7% |
| Other values (4) | 157 | 2.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| i | 4983 | 12.0% |
| e | 3958 | 9.5% |
| a | 3477 | 8.4% |
| t | 2972 | 7.2% |
| n | 2757 | 6.6% |
| h | 2525 | 6.1% |
| (space) | 2516 | 6.1% |
| c | 2092 | 5.0% |
| W | 2067 | 5.0% |
| r | 1645 | 4.0% |
| Other values (21) | 12467 | 30.1% |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.9468524 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Male |
|---|---|
| 2nd row | Male |
| 3rd row | Female |
| 4th row | Female |
| 5th row | Female |
Common Values
| Value | Count | Frequency (%) |
| Male | 2041 | 52.6% |
| Female | 1835 | 47.3% |
| (Missing) | 5 | 0.1% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| male | 2041 | 52.7% |
| female | 1835 | 47.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 5711 | 29.8% |
| a | 3876 | 20.2% |
| l | 3876 | 20.2% |
| M | 2041 | 10.6% |
| F | 1835 | 9.6% |
| m | 1835 | 9.6% |
metastatic_site
Categorical
High correlation  Imbalance  Missing 
| Distinct | 32 |
|---|---|
| Distinct (%) | 4.0% |
| Missing | 3085 |
| Missing (%) | 79.5% |
| Memory size | 245.9 KiB |
Length
| Max length | 18 |
|---|---|
| Median length | 14 |
| Mean length | 11.182161 |
| Min length | 4 |
Unique
| Unique | 8 |
|---|---|
| Unique (%) | 1.0% |
Sample
| 1st row | Not Applicable |
|---|---|
| 2nd row | Not Applicable |
| 3rd row | Not Applicable |
| 4th row | Not Applicable |
| 5th row | Liver |
Common Values
| Value | Count | Frequency (%) |
| Not Applicable | 483 | 12.4% |
| Liver | 150 | 3.9% |
| Mesentery | 15 | 0.4% |
| Peritoneum | 15 | 0.4% |
| Abdomen | 13 | 0.3% |
| Pelvis | 13 | 0.3% |
| Small Bowel | 13 | 0.3% |
| Omentum | 11 | 0.3% |
| Spleen | 9 | 0.2% |
| Skin | 9 | 0.2% |
| Other values (22) | 65 | 1.7% |
| (Missing) | 3085 | 79.5% |
Length
| Value | Count | Frequency (%) |
| not | 486 | 36.6% |
| applicable | 483 | 36.3% |
| liver | 150 | 11.3% |
| mesentery | 15 | 1.1% |
| peritoneum | 15 | 1.1% |
| pelvis | 15 | 1.1% |
| abdomen | 13 | 1.0% |
| small | 13 | 1.0% |
| bowel | 13 | 1.0% |
| omentum | 11 | 0.8% |
| Other values (29) | 115 | 8.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| l | 1079 | 12.1% |
| p | 987 | 11.1% |
| e | 824 | 9.3% |
| i | 702 | 7.9% |
| o | 572 | 6.4% |
| a | 552 | 6.2% |
| t | 546 | 6.1% |
| (space) | 533 | 6.0% |
| b | 507 | 5.7% |
| A | 504 | 5.7% |
| Other values (27) | 2095 | 23.5% |
tumor_purity
Real number (ℝ)
High correlation  Missing 
| Distinct | 14 |
|---|---|
| Distinct (%) | 1.7% |
| Missing | 3037 |
| Missing (%) | 78.3% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 66.156398 |
| Minimum | 10 |
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.4 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 30 |
| Q1 | 60 |
| median | 70 |
| Q3 | 80 |
| 95-th percentile | 90 |
| Maximum | 90 |
| Range | 80 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 18.460607 |
|---|---|
| Coefficient of variation (CV) | 0.27904492 |
| Kurtosis | 0.20032513 |
| Mean | 66.156398 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | -0.85498788 |
| Sum | 55836 |
| Variance | 340.79402 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 80 | 221 | 5.7% |
| 70 | 168 | 4.3% |
| 60 | 150 | 3.9% |
| 90 | 104 | 2.7% |
| 50 | 71 | 1.8% |
| 40 | 50 | 1.3% |
| 30 | 44 | 1.1% |
| 20 | 13 | 0.3% |
| 10 | 9 | 0.2% |
| 85 | 6 | 0.2% |
| Other values (4) | 8 | 0.2% |
| (Missing) | 3037 | 78.3% |
| Value | Count | Frequency (%) |
| 10 | 9 | 0.2% |
| 15 | 2 | 0.1% |
| 20 | 13 | 0.3% |
| 30 | 44 | 1.1% |
| 35 | 4 | 0.1% |
| 40 | 50 | 1.3% |
| 50 | 71 | 1.8% |
| 60 | 150 | 3.9% |
| 63 | 1 | < 0.1% |
| 70 | 168 | 4.3% |
| Value | Count | Frequency (%) |
| 90 | 104 | 2.7% |
| 85 | 6 | 0.2% |
| 80 | 221 | 5.7% |
| 73 | 1 | < 0.1% |
| 70 | 168 | 4.3% |
| 63 | 1 | < 0.1% |
| 60 | 150 | 3.9% |
| 50 | 71 | 1.8% |
| 40 | 50 | 1.3% |
| 35 | 4 | 0.1% |
sample_coverage
Real number (ℝ)
High correlation  Missing 
| Distinct | 406 |
|---|---|
| Distinct (%) | 51.0% |
| Missing | 3085 |
| Missing (%) | 79.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 670.40704 |
| Minimum | 106 |
| Maximum | 1270 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 30.4 KiB |
Quantile statistics
| Minimum | 106 |
|---|---|
| 5-th percentile | 330.5 |
| Q1 | 520 |
| median | 672.5 |
| Q3 | 812 |
| 95-th percentile | 1023 |
| Maximum | 1270 |
| Range | 1164 |
| Interquartile range (IQR) | 292 |
Descriptive statistics
| Standard deviation | 211.57931 |
|---|---|
| Coefficient of variation (CV) | 0.31559828 |
| Kurtosis | -0.24543072 |
| Mean | 670.40704 |
| Median Absolute Deviation (MAD) | 146.5 |
| Skewness | 0.029028888 |
| Sum | 533644 |
| Variance | 44765.804 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1132 | 10 | 0.3% |
| 674 | 9 | 0.2% |
| 1023 | 9 | 0.2% |
| 920 | 7 | 0.2% |
| 583 | 7 | 0.2% |
| 808 | 7 | 0.2% |
| 682 | 7 | 0.2% |
| 780 | 7 | 0.2% |
| 434 | 7 | 0.2% |
| 677 | 7 | 0.2% |
| Other values (396) | 719 | 18.5% |
| (Missing) | 3085 | 79.5% |
| Value | Count | Frequency (%) |
| 106 | 2 | 0.1% |
| 148 | 1 | < 0.1% |
| 152 | 1 | < 0.1% |
| 172 | 2 | 0.1% |
| 176 | 2 | 0.1% |
| 182 | 4 | 0.1% |
| 184 | 1 | < 0.1% |
| 189 | 1 | < 0.1% |
| 205 | 1 | < 0.1% |
| 206 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1270 | 1 | < 0.1% |
| 1243 | 1 | < 0.1% |
| 1225 | 1 | < 0.1% |
| 1152 | 1 | < 0.1% |
| 1135 | 1 | < 0.1% |
| 1132 | 10 | 0.3% |
| 1108 | 1 | < 0.1% |
| 1107 | 2 | 0.1% |
| 1085 | 3 | 0.1% |
| 1080 | 1 | < 0.1% |
os_months
Text
Missing 
| Distinct | 539 |
|---|---|
| Distinct (%) | 17.0% |
| Missing | 710 |
| Missing (%) | 18.3% |
| Memory size | 212.5 KiB |
Length
| Max length | 7 |
|---|---|
| Median length | 4 |
| Mean length | 4.4203721 |
| Min length | 3 |
Unique
| Unique | 367 |
|---|---|
| Unique (%) | 11.6% |
Sample
| 1st row | 11.079 |
|---|---|
| 2nd row | 11.079 |
| 3rd row | 11.079 |
| 4th row | 11.079 |
| 5th row | 11.079 |
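os_months is profiled as Text even though the sample rows look numeric, and the character table for this variable lists six characters beyond the ten it shows, so some entries may not be plain numbers. A minimal pandas sketch of one way to coerce such a column, assuming the values are meant to be decimal strings; the miniature Series below is hypothetical and not taken from the dataset:

```python
import pandas as pd

# Hypothetical miniature of os_months: numeric strings, a missing cell,
# and one entry that is not a number.
os_months_raw = pd.Series(
    ["11.079", "11.079", "0", None, "not recorded"], name="os_months"
)

# errors="coerce" turns unparseable entries into NaN instead of raising.
os_months = pd.to_numeric(os_months_raw, errors="coerce")

# Entries that were present as text but could not be parsed as numbers.
unparsed = os_months_raw[os_months.isna() & os_months_raw.notna()]

print(os_months.dtype)
print(unparsed.tolist())
```

Inspecting `unparsed` before discarding anything shows which strings kept the column from being read as numeric in the first place.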
| Value | Count | Frequency (%) |
| 0000 | 126 | 4.0% |
| 0001 | 89 | 2.8% |
| 0003 | 85 | 2.7% |
| 0004 | 80 | 2.5% |
| 0006 | 78 | 2.5% |
| 0002 | 77 | 2.4% |
| 0005 | 76 | 2.4% |
| 0009 | 76 | 2.4% |
| 0010 | 74 | 2.3% |
| 0019 | 73 | 2.3% |
| Other values (529) | 2337 | 73.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 6226 | 44.4% |
| 1 | 1377 | 9.8% |
| 2 | 1001 | 7.1% |
| 4 | 844 | 6.0% |
| 3 | 808 | 5.8% |
| . | 742 | 5.3% |
| 5 | 684 | 4.9% |
| 7 | 608 | 4.3% |
| 6 | 580 | 4.1% |
| 9 | 572 | 4.1% |
| Other values (6) | 575 | 4.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 14017 |
treatment_start
Text
| Distinct | 237 |
|---|---|
| Distinct (%) | 6.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 261.0 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 12 |
| Mean length | 11.832002 |
| Min length | 10 |
Unique
| Unique | 168 |
|---|---|
| Unique (%) | 4.3% |
Sample
| 1st row | MISSING_DATE |
|---|---|
| 2nd row | MISSING_DATE |
| 3rd row | MISSING_DATE |
| 4th row | MISSING_DATE |
| 5th row | MISSING_DATE |
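Most treatment_start entries are the sentinel string MISSING_DATE, which is why the column reports 0 missing values, and the remaining values are ISO-style dates. A minimal pandas sketch of one way to convert the column, assuming the sentinel should become a missing value. The four-row Series is a hypothetical miniature built from values listed in this section, and the 1910 cutoff used to flag implausibly early dates is an arbitrary assumption:

```python
import pandas as pd

raw = pd.Series(
    ["MISSING_DATE", "1899-12-30", "1900-01-18", "1905-05-29"],
    name="treatment_start",
)

# Blank out the sentinel, then parse what remains as YYYY-MM-DD.
cleaned = raw.where(raw != "MISSING_DATE")
dates = pd.to_datetime(cleaned, format="%Y-%m-%d", errors="coerce")

# Assumed cutoff: dates this early are unlikely to be real treatment starts.
suspect = dates < pd.Timestamp("1910-01-01")

print(dates.isna().sum(), suspect.sum())
```

Dates clustered around 1899-12-30 and early 1900 can be a sign of spreadsheet serial numbers converted with the wrong epoch, so the flagged rows may deserve a check against the original source before use.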
| Value | Count | Frequency (%) |
| missing_date | 3555 | 91.6% |
| 1899-12-30 | 8 | 0.2% |
| 1900-01-18 | 5 | 0.1% |
| 1900-01-13 | 4 | 0.1% |
| 1905-05-29 | 4 | 0.1% |
| 1899-12-29 | 4 | 0.1% |
| 1900-01-17 | 4 | 0.1% |
| 1900-01-21 | 3 | 0.1% |
| 1905-06-21 | 3 | 0.1% |
| 1900-01-09 | 3 | 0.1% |
| Other values (227) | 288 | 7.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| I | 7110 | 15.5% |
| S | 7110 | 15.5% |
| M | 3555 | 7.7% |
| N | 3555 | 7.7% |
| G | 3555 | 7.7% |
| _ | 3555 | 7.7% |
| D | 3555 | 7.7% |
| A | 3555 | 7.7% |
| T | 3555 | 7.7% |
| E | 3555 | 7.7% |
| Other values (11) | 3260 | 7.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 45920 |
os_status
Categorical
High correlation  Missing 
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 666 |
| Missing (%) | 17.2% |
| Memory size | 245.8 KiB |
| DECEASED | 2569 |
|---|---|
| ALIVE | 512 |
| DECEASED_NON_CANCER | 134 |
Length
| Max length | 19 |
|---|---|
| Median length | 8 |
| Mean length | 7.9807154 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | DECEASED |
|---|---|
| 2nd row | DECEASED |
| 3rd row | DECEASED |
| 4th row | DECEASED |
| 5th row | DECEASED |
Common Values
| Value | Count | Frequency (%) |
| DECEASED | 2569 | 66.2% |
| ALIVE | 512 | 13.2% |
| DECEASED_NON_CANCER | 134 | 3.5% |
| (Missing) | 666 | 17.2% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| deceased | 2569 | 79.9% |
| alive | 512 | 15.9% |
| deceased_non_cancer | 134 | 4.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 8755 | 34.1% |
| D | 5406 | 21.1% |
| A | 3349 | 13.1% |
| C | 2971 | 11.6% |
| S | 2703 | 10.5% |
| L | 512 | 2.0% |
| I | 512 | 2.0% |
| V | 512 | 2.0% |
| N | 402 | 1.6% |
| _ | 268 | 1.0% |
| Other values (2) | 268 | 1.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 25658 |
source
Categorical
High correlation 
| Distinct | 5 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 237.0 KiB |
| SEER | 2429 |
|---|---|
| CBioPortal | 796 |
| COSMIC | 563 |
| GDC | 74 |
| PDMR | 19 |
Length
| Max length | 10 |
|---|---|
| Median length | 4 |
| Mean length | 5.5016748 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | COSMIC |
|---|---|
| 2nd row | COSMIC |
| 3rd row | COSMIC |
| 4th row | COSMIC |
| 5th row | COSMIC |
Common Values
| Value | Count | Frequency (%) |
| SEER | 2429 | 62.6% |
| CBioPortal | 796 | 20.5% |
| COSMIC | 563 | 14.5% |
| GDC | 74 | 1.9% |
| PDMR | 19 | 0.5% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| seer | 2429 | 62.6% |
| cbioportal | 796 | 20.5% |
| cosmic | 563 | 14.5% |
| gdc | 74 | 1.9% |
| pdmr | 19 | 0.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| E | 4858 | 22.8% |
| S | 2992 | 14.0% |
| R | 2448 | 11.5% |
| C | 1996 | 9.3% |
| o | 1592 | 7.5% |
| P | 815 | 3.8% |
| B | 796 | 3.7% |
| i | 796 | 3.7% |
| r | 796 | 3.7% |
| t | 796 | 3.7% |
| Other values (7) | 3467 | 16.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 21352 |
tumor_grade
Categorical
High correlation 
| Distinct | 4 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 250.8 KiB |
| Unknown | 1688 |
|---|---|
| High grade | 1672 |
| Low grade | 277 |
| Intermediate grade | 244 |
Length
| Max length | 18 |
|---|---|
| Median length | 10 |
| Mean length | 9.1267715 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Unknown |
|---|---|
| 2nd row | Unknown |
| 3rd row | Unknown |
| 4th row | Unknown |
| 5th row | Unknown |
Common Values
| Value | Count | Frequency (%) |
| Unknown | 1688 | 43.5% |
| High grade | 1672 | 43.1% |
| Low grade | 277 | 7.1% |
| Intermediate grade | 244 | 6.3% |
Length
Common Values (Plot)
| Value | Count | Frequency (%) |
| grade | 2193 | 36.1% |
| unknown | 1688 | 27.8% |
| high | 1672 | 27.5% |
| low | 277 | 4.6% |
| intermediate | 244 | 4.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 5308 | 15.0% |
| g | 3865 | 10.9% |
| e | 2925 | 8.3% |
| d | 2437 | 6.9% |
| r | 2437 | 6.9% |
| a | 2437 | 6.9% |
| (space) | 2193 | 6.2% |
| o | 1965 | 5.5% |
| w | 1965 | 5.5% |
| i | 1916 | 5.4% |
| Other values (8) | 7973 | 22.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 35421 |
mutated_genes
Text
Missing 
| Distinct | 283 |
|---|---|
| Distinct (%) | 21.2% |
| Missing | 2545 |
| Missing (%) | 65.6% |
| Memory size | 170.7 KiB |
Length
| Max length | 144 |
|---|---|
| Median length | 7 |
| Mean length | 12.741766 |
| Min length | 7 |
Unique
| Unique | 204 |
|---|---|
| Unique (%) | 15.3% |
Sample
| 1st row | ['KIT'] |
|---|---|
| 2nd row | ['KIT'] |
| 3rd row | ['KIT'] |
| 4th row | ['KIT'] |
| 5th row | ['KIT'] |
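The sample rows show mutated_genes stored as the text of a Python list rather than as a list. A minimal sketch of one way to recover per-gene counts, assuming every non-missing cell is a valid list literal; the three-row Series and the helper name `parse_genes` are hypothetical:

```python
import ast

import pandas as pd

# Hypothetical miniature of mutated_genes, including one missing cell.
raw = pd.Series(["['KIT']", "['KIT', 'PDGFRA']", None], name="mutated_genes")


def parse_genes(cell):
    """Return the list of gene symbols in a cell, or [] if the cell is missing."""
    if pd.isna(cell):
        return []
    # literal_eval only evaluates Python literals, so it is safer than eval.
    return ast.literal_eval(cell)


genes = raw.map(parse_genes)

# One row per gene; empty lists become NaN on explode and are dropped.
counts = genes.explode().dropna().value_counts()
print(counts)
```

Counting after `explode` gives one tally per gene symbol, which is closer to what the word-frequency table for this variable approximates by splitting on punctuation.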
| Value | Count | Frequency (%) |
| kit | 1125 | 51.3% |
| pdgfra | 91 | 4.2% |
| rb1 | 51 | 2.3% |
| tp53 | 45 | 2.1% |
| nf1 | 42 | 1.9% |
| max | 41 | 1.9% |
| setd2 | 38 | 1.7% |
| mga | 33 | 1.5% |
| braf | 30 | 1.4% |
| pten | 27 | 1.2% |
| Other values (221) | 669 | 30.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| ' | 4384 | 25.8% |
| T | 1502 | 8.8% |
| [ | 1336 | 7.8% |
| ] | 1336 | 7.8% |
| K | 1282 | 7.5% |
| I | 1209 | 7.1% |
| (space) | 856 | 5.0% |
| , | 856 | 5.0% |
| A | 417 | 2.4% |
| R | 410 | 2.4% |
| Other values (32) | 3435 | 20.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 17023 |
Interactions
Correlations
| | Unnamed: 0 | age_at_diagnosis | gender | metastatic_site | mitotic_rate | os_status | primary_site | race | sample_coverage | sample_type | source | stage_at_diagnosis | treatment | treatment_response | tumor_grade | tumor_purity | tumor_size |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 0 | 1.000 | 0.178 | 0.143 | 0.303 | -0.184 | 0.478 | 0.240 | 0.227 | -0.376 | 0.560 | 0.662 | 0.356 | 0.436 | 0.517 | 0.459 | -0.008 | -0.161 |
| age_at_diagnosis | 0.178 | 1.000 | 0.075 | 0.179 | -0.026 | 0.220 | 0.086 | 0.103 | -0.152 | 0.185 | 0.172 | 0.069 | 0.138 | 0.115 | 0.149 | 0.071 | -0.029 |
| gender | 0.143 | 0.075 | 1.000 | 0.216 | 0.119 | 0.031 | 0.113 | 0.085 | 0.146 | 0.113 | 0.142 | 0.112 | 0.158 | 0.125 | 0.141 | 0.000 | 0.132 |
| metastatic_site | 0.303 | 0.179 | 0.216 | 1.000 | 0.000 | 0.391 | 0.189 | 0.153 | 0.239 | 0.512 | 1.000 | 0.454 | 0.174 | 0.235 | 1.000 | 0.190 | 0.153 |
| mitotic_rate | -0.184 | -0.026 | 0.119 | 0.000 | 1.000 | 0.219 | 0.076 | 0.059 | 0.010 | 0.000 | 1.000 | 0.346 | 0.000 | 0.134 | 1.000 | 0.108 | 0.240 |
| os_status | 0.478 | 0.220 | 0.031 | 0.391 | 0.219 | 1.000 | 0.259 | 0.286 | 0.161 | 0.560 | 0.766 | 0.226 | 0.564 | 0.312 | 0.448 | 0.130 | 0.235 |
| primary_site | 0.240 | 0.086 | 0.113 | 0.189 | 0.076 | 0.259 | 1.000 | 0.281 | 0.094 | 0.313 | 0.399 | 0.254 | 0.200 | 0.197 | 0.263 | 0.089 | 0.063 |
| race | 0.227 | 0.103 | 0.085 | 0.153 | 0.059 | 0.286 | 0.281 | 1.000 | 0.000 | 0.293 | 0.459 | 0.139 | 0.207 | 0.091 | 0.250 | 0.054 | 0.000 |
| sample_coverage | -0.376 | -0.152 | 0.146 | 0.239 | 0.010 | 0.161 | 0.094 | 0.000 | 1.000 | 0.227 | 1.000 | 0.209 | 0.093 | 0.143 | 1.000 | 0.072 | 0.047 |
| sample_type | 0.560 | 0.185 | 0.113 | 0.512 | 0.000 | 0.560 | 0.313 | 0.293 | 0.227 | 1.000 | 0.635 | 0.309 | 0.646 | 0.350 | 0.474 | 0.025 | 0.075 |
| source | 0.662 | 0.172 | 0.142 | 1.000 | 1.000 | 0.766 | 0.399 | 0.459 | 1.000 | 0.635 | 1.000 | 0.598 | 0.861 | 0.598 | 0.504 | 0.000 | 1.000 |
| stage_at_diagnosis | 0.356 | 0.069 | 0.112 | 0.454 | 0.346 | 0.226 | 0.254 | 0.139 | 0.209 | 0.309 | 0.598 | 1.000 | 0.531 | 0.626 | 0.287 | 0.000 | 0.236 |
| treatment | 0.436 | 0.138 | 0.158 | 0.174 | 0.000 | 0.564 | 0.200 | 0.207 | 0.093 | 0.646 | 0.861 | 0.531 | 1.000 | 0.483 | 0.497 | 0.000 | 0.000 |
| treatment_response | 0.517 | 0.115 | 0.125 | 0.235 | 0.134 | 0.312 | 0.197 | 0.091 | 0.143 | 0.350 | 0.598 | 0.626 | 0.483 | 1.000 | 0.080 | 0.088 | 0.112 |
| tumor_grade | 0.459 | 0.149 | 0.141 | 1.000 | 1.000 | 0.448 | 0.263 | 0.250 | 1.000 | 0.474 | 0.504 | 0.287 | 0.497 | 0.080 | 1.000 | 1.000 | 1.000 |
| tumor_purity | -0.008 | 0.071 | 0.000 | 0.190 | 0.108 | 0.130 | 0.089 | 0.054 | 0.072 | 0.025 | 0.000 | 0.000 | 0.000 | 0.088 | 1.000 | 1.000 | -0.002 |
| tumor_size | -0.161 | -0.029 | 0.132 | 0.153 | 0.240 | 0.235 | 0.063 | 0.000 | 0.047 | 0.075 | 1.000 | 0.236 | 0.000 | 0.112 | 1.000 | -0.002 | 1.000 |
Missing values
Sample
| | Unnamed: 0 | sample_id | patient_id | age_at_diagnosis | stage_at_diagnosis | tumor_size | mitotic_rate | treatment | treatment_response | primary_site | sample_type | race | gender | metastatic_site | tumor_purity | sample_coverage | os_months | treatment_start | os_status | source | tumor_grade | mutated_genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | COSS1030183 | 924209 | 69.0 | Unknown | NaN | NaN | IMATINIB | PR | Stomach | Primary | NaN | Male | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 1 | 1 | COSS1030184 | 924209 | 69.0 | Unknown | NaN | NaN | IMATINIB | PR | Stomach | Primary | NaN | Male | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 2 | 2 | COSS1035469 | 929361 | 52.0 | Unknown | NaN | NaN | IMATINIB | NR | Small Intestine | Metastasis | NaN | Female | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 3 | 3 | COSS1035470 | 929361 | 52.0 | Unknown | NaN | NaN | IMATINIB | PR | Small Intestine | Metastasis | NaN | Female | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 4 | 4 | COSS1036012 | 929884 | 57.0 | Unknown | NaN | NaN | IMATINIB | CR | Small Intestine | Unknown | NaN | Female | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 5 | 5 | COSS1036013 | 929885 | 45.0 | Unknown | NaN | NaN | IMATINIB | CR | Stomach | Unknown | NaN | Male | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 6 | 6 | COSS1036014 | 929886 | 51.0 | Unknown | NaN | NaN | IMATINIB | CR | Small Intestine | Unknown | NaN | Female | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 7 | 7 | COSS1046535 | 939749 | 58.0 | Unknown | NaN | NaN | IMATINIB | PR | Small Intestine | Local Recurrence | NaN | Female | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 8 | 8 | COSS1046536 | 939749 | 58.0 | Unknown | NaN | NaN | IMATINIB | NR | Small Intestine | Metastasis | NaN | Female | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| 9 | 9 | COSS1117717 | 1005414 | 57.0 | Unknown | NaN | NaN | IMATINIB | CR | Colon/Rectum | Unknown | NaN | Male | NaN | NaN | NaN | NaN | MISSING_DATE | NaN | COSMIC | Unknown | ['KIT'] |
| | Unnamed: 0 | sample_id | patient_id | age_at_diagnosis | stage_at_diagnosis | tumor_size | mitotic_rate | treatment | treatment_response | primary_site | sample_type | race | gender | metastatic_site | tumor_purity | sample_coverage | os_months | treatment_start | os_status | source | tumor_grade | mutated_genes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3871 | 3871 | NaN | eacfb466-572c-43da-8efa-2fa76b54f924 | 72.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Small Intestine | Metastasis | NaN | Female | NaN | 70.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3872 | 3872 | NaN | ef5cdca3-6945-4bcd-9551-ab65ac508399 | 54.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Small Intestine | Metastasis | NaN | Female | NaN | 40.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3873 | 3873 | NaN | f126e7c0-23a5-4966-9b6e-45bea9b4310a | 26.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Soft Tissue | Metastasis | NaN | Male | NaN | 50.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3874 | 3874 | NaN | f29b7559-d664-4ffb-88c0-77a2d2df659b | 66.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Small Intestine | Metastasis | NaN | Male | NaN | 80.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3875 | 3875 | NaN | f33ce1b9-a7ca-4659-b747-c5bba5949585 | 69.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Soft Tissue | Metastasis | NaN | Female | NaN | 60.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3876 | 3876 | NaN | faaa8f08-cfc4-4aec-a350-bfaf7c6458eb | 58.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Stomach | Metastasis | NaN | Male | NaN | 40.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3877 | 3877 | NaN | fae9b190-d0d8-42ac-88b7-0c94f301ff23 | 48.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Soft Tissue | Metastasis | NaN | Male | NaN | 80.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3878 | 3878 | NaN | fbcb378e-b5f2-4c5f-b9a2-c14b5d620198 | 58.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Stomach | Metastasis | NaN | Male | NaN | 70.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3879 | 3879 | NaN | fd9c738b-ac3a-4a95-aa1c-30b37254f639 | 43.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Stomach | Metastasis | NaN | Male | NaN | 60.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |
| 3880 | 3880 | NaN | ffb0514c-62a4-4970-b825-d49a0e570550 | 65.0 | metastasis | NaN | NaN | UNKNOWN | NaN | Retroperitoneum | Metastasis | NaN | Female | NaN | 60.0 | NaN | NaN | MISSING_DATE | NaN | GDC | Unknown | NaN |